GATE PYQ

Computer Organization

Q231.
We have two designs, D1 and D2, for a synchronous pipeline processor. D1 has 5 pipeline stages with execution times of 3 nsec, 2 nsec, 4 nsec, 2 nsec and 3 nsec, while D2 has 8 pipeline stages, each with a 2 nsec execution time. How much time can be saved using design D2 over design D1 for executing 100 instructions?
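
One way to check the arithmetic for Q231 is the sketch below. It assumes the usual textbook model: the clock period of a synchronous pipeline equals its slowest stage, and n instructions on a k-stage pipeline take (k + n - 1) clock periods with no stalls. The function name `pipeline_time_ns` is illustrative, not part of the question.

```python
def pipeline_time_ns(stage_delays_ns, n_instructions):
    """Total time for n instructions on an ideal (stall-free) synchronous pipeline."""
    cycle_ns = max(stage_delays_ns)   # clock period is set by the slowest stage
    k = len(stage_delays_ns)          # number of pipeline stages
    return (k + n_instructions - 1) * cycle_ns

d1_ns = pipeline_time_ns([3, 2, 4, 2, 3], 100)  # design D1
d2_ns = pipeline_time_ns([2] * 8, 100)          # design D2
saved_ns = d1_ns - d2_ns
print(d1_ns, d2_ns, saved_ns)
```

Under these assumptions D1 needs 416 nsec, D2 needs 214 nsec, and the saving is 202 nsec.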

Q232.
A pipelined processor uses a 4-stage instruction pipeline with the following stages: Instruction fetch (IF), Instruction decode (ID), Execute (EX) and Writeback (WB). The arithmetic operations as well as the load and store operations are carried out in the EX stage. The sequence of instructions corresponding to the statement X = (S - R * (P + Q))/T is given below. The values of variables P, Q, R, S and T are available in the registers R0, R1, R2, R3 and R4 respectively, before the execution of the instruction sequence.

\begin{array}{lll}
\text{ADD} & R5, R0, R1 & ; R5 \leftarrow R0 + R1 \\
\text{MUL} & R6, R2, R5 & ; R6 \leftarrow R2 * R5 \\
\text{SUB} & R5, R3, R6 & ; R5 \leftarrow R3 - R6 \\
\text{DIV} & R6, R5, R4 & ; R6 \leftarrow R5 / R4 \\
\text{STORE} & R6, X & ; X \leftarrow R6
\end{array}

The number of Read-After-Write (RAW) dependencies, Write-After-Read (WAR) dependencies, and Write-After-Write (WAW) dependencies in the sequence of instructions are, respectively,
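
The dependency counts for Q232 can be tallied mechanically, as in the sketch below. The counting convention is an assumption: a pair of instructions counts only if no instruction between them rewrites the register involved. Other conventions can give different totals. The names `PROGRAM` and `count_dependencies` are illustrative.

```python
# Each entry: (mnemonic, destination register or None, source registers).
# STORE writes memory location X, so it has no destination register.
PROGRAM = [
    ("ADD", "R5", ("R0", "R1")),
    ("MUL", "R6", ("R2", "R5")),
    ("SUB", "R5", ("R3", "R6")),
    ("DIV", "R6", ("R5", "R4")),
    ("STORE", None, ("R6",)),
]

def count_dependencies(program):
    """Return (RAW, WAR, WAW) counts over all ordered pairs i < j."""
    raw = war = waw = 0
    for j, (_, dst_j, srcs_j) in enumerate(program):
        for i in range(j):
            _, dst_i, srcs_i = program[i]
            # Registers written by instructions strictly between i and j.
            between = {program[k][1] for k in range(i + 1, j)}
            if dst_i is not None and dst_i not in between:
                if dst_i in srcs_j:      # j reads what i wrote
                    raw += 1
                if dst_i == dst_j:       # j overwrites what i wrote
                    waw += 1
            if dst_j is not None and dst_j in srcs_i and dst_j not in between:
                war += 1                 # j overwrites what i read
    return raw, war, waw

raw, war, waw = count_dependencies(PROGRAM)
print(raw, war, waw)
```

With this convention the sequence has 4 RAW, 2 WAR and 2 WAW dependencies.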

Q233.
A 5-stage pipelined processor has Instruction Fetch (IF), Instruction Decode (ID), Operand Fetch (OF), Perform Operation (PO) and Write Operand (WO) stages. The IF, ID, OF and WO stages take 1 clock cycle each for any instruction. The PO stage takes 1 clock cycle for ADD and SUB instructions, 3 clock cycles for MUL instruction, and 6 clock cycles for DIV instruction respectively. Operand forwarding is used in the pipeline. What is the number of clock cycles needed to execute the following sequence of instructions?

Q234.
A pipelined processor uses a 4-stage instruction pipeline with the following stages: Instruction fetch (IF), Instruction decode (ID), Execute (EX) and Writeback (WB). The arithmetic operations as well as the load and store operations are carried out in the EX stage. The sequence of instructions corresponding to the statement X = (S - R * (P + Q))/T is given below. The values of variables P, Q, R, S and T are available in the registers R0, R1, R2, R3 and R4 respectively, before the execution of the instruction sequence.

\begin{array}{lll}
\text{ADD} & R5, R0, R1 & ; R5 \leftarrow R0 + R1 \\
\text{MUL} & R6, R2, R5 & ; R6 \leftarrow R2 * R5 \\
\text{SUB} & R5, R3, R6 & ; R5 \leftarrow R3 - R6 \\
\text{DIV} & R6, R5, R4 & ; R6 \leftarrow R5 / R4 \\
\text{STORE} & R6, X & ; X \leftarrow R6
\end{array}

The IF, ID and WB stages take 1 clock cycle each. The EX stage takes 1 clock cycle each for the ADD, SUB and STORE operations, and 3 clock cycles each for MUL and DIV operations. Operand forwarding from the EX stage to the ID stage is used. The number of clock cycles required to complete the sequence of instructions is
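
A rough timing model for Q234 is sketched below. It rests on simplifying assumptions that are not stated in the question: instructions issue in order, there is a single EX unit, and forwarding lets a dependent instruction enter EX in the cycle right after its producer leaves EX. Because every instruction in this sequence depends on its predecessor, the dependency constraint coincides with waiting for the EX unit. The names `EX_LATENCY`, `SEQUENCE` and `total_cycles` are illustrative.

```python
# EX-stage latency in clock cycles for each operation, as given in the question.
EX_LATENCY = {"ADD": 1, "SUB": 1, "STORE": 1, "MUL": 3, "DIV": 3}
SEQUENCE = ["ADD", "MUL", "SUB", "DIV", "STORE"]

def total_cycles(sequence, ex_latency):
    """Cycle in which the last instruction finishes WB (cycles numbered from 1)."""
    prev_ex_end = 0
    wb_cycle = 0
    for index, op in enumerate(sequence, start=1):
        # IF happens in cycle `index` and ID in `index + 1`,
        # so EX can start in cycle `index + 2` at the earliest.
        ex_start = max(index + 2, prev_ex_end + 1)
        prev_ex_end = ex_start + ex_latency[op] - 1
        wb_cycle = prev_ex_end + 1    # WB takes the cycle after EX ends
    return wb_cycle

cycles = total_cycles(SEQUENCE, EX_LATENCY)
print(cycles)
```

Under this model the five instructions complete in 12 clock cycles.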

Q235.
Delayed branching can help in the handling of control hazards. The following code is to run on a pipelined processor with one branch delay slot:

I1: ADD R2 \leftarrow R7 + R8
I2: SUB R4 \leftarrow R5 - R6
I3: ADD R1 \leftarrow R2 + R3
I4: STORE Memory[R4] \leftarrow R1
BRANCH to Label if R1 == 0

Which of the instructions I1, I2, I3 or I4 can legitimately occupy the delay slot without any other program modification?

Q236.
A non-pipelined single-cycle processor operating at 100 MHz is converted into a synchronous pipelined processor with five stages requiring 2.5 nsec, 1.5 nsec, 2 nsec, 1.5 nsec and 2.5 nsec, respectively. The delay of the latches is 0.5 nsec. The speedup of the pipelined processor for a large number of instructions is:
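
The speedup in Q236 can be checked with the sketch below. It assumes the standard model: the pipelined clock period is the slowest stage plus the latch delay, and for a large number of instructions the pipeline completes one instruction per clock. The function name `asymptotic_speedup` is illustrative.

```python
def asymptotic_speedup(non_pipelined_ns, stage_delays_ns, latch_ns=0.0):
    """Speedup for a large instruction count: one instruction per pipelined clock."""
    cycle_ns = max(stage_delays_ns) + latch_ns  # slowest stage plus latch overhead
    return non_pipelined_ns / cycle_ns

non_pipelined_ns = 1e3 / 100   # 100 MHz single-cycle processor -> 10 ns per instruction
speedup = asymptotic_speedup(non_pipelined_ns, [2.5, 1.5, 2, 1.5, 2.5], latch_ns=0.5)
print(round(speedup, 2))
```

Here the pipelined clock period is 3 nsec, so the speedup is 10/3, about 3.33.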

Q237.
Which of the following are NOT true in a pipelined processor?
I. Bypassing can handle all RAW hazards.
II. Register renaming can eliminate all register-carried WAR hazards.
III. Control hazard penalties can be eliminated by dynamic branch prediction.

Q238.
Consider two processors P1 and P2 executing the same instruction set. Assume that under identical conditions, for the same input, a program running on P2 takes 25% less time but incurs 20% more CPI (clock cycles per instruction) as compared to the program running on P1. If the clock frequency of P1 is 1 GHz, then the clock frequency of P2 (in GHz) is _________.
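
Q238 follows from the relation execution time = instruction count × CPI / clock frequency. The sketch below assumes both processors execute the same number of instructions, which holds because they run the same program on the same instruction set. The variable names are illustrative.

```python
f1_ghz = 1.0        # clock frequency of P1
time_ratio = 0.75   # T2 / T1: P2 takes 25% less time
cpi_ratio = 1.20    # CPI2 / CPI1: P2 incurs 20% more CPI

# T = IC * CPI / f  =>  f2 / f1 = (CPI2 / CPI1) / (T2 / T1) for equal instruction counts
f2_ghz = f1_ghz * cpi_ratio / time_ratio
print(round(f2_ghz, 2))
```

This gives a clock frequency of 1.6 GHz for P2.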

Q239.
A processor takes 12 cycles to complete an instruction I. The corresponding pipelined processor uses 6 stages with the execution times of 3, 2, 5, 4, 6 and 2 cycles respectively. What is the asymptotic speedup assuming that a very large number of instructions are to be executed?
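
A worked version of Q239 is sketched below. It assumes the non-pipelined time is the stated 12 cycles per instruction, and that for a very large number of instructions the pipeline completes one instruction per slowest-stage time. The symbols T_{\text{non-pipelined}} and T_{\text{pipelined}} are introduced here for illustration.

```latex
\begin{aligned}
T_{\text{non-pipelined}} &= 12 \text{ cycles per instruction} \\
T_{\text{pipelined}} &= \max(3, 2, 5, 4, 6, 2) = 6 \text{ cycles per instruction} \\
\text{Speedup} &= \frac{T_{\text{non-pipelined}}}{T_{\text{pipelined}}} = \frac{12}{6} = 2
\end{aligned}
```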

Q240.
Consider a 4 stage pipeline processor. The number of cycles needed by the four instructions I1, I2, I3, I4 in stages S1, S2, S3, S4 is shown below: What is the number of cycles needed to execute the following loop? For (i=1 to 2) {I1; I2; I3; I4;}